Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning

Authors

Borja Fernandez-Gauna

Ismael Etxeberria-Agiriano

Manuel Graña

Catalin Buiu

Abstract

Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out concurrently by the agents. In this paper we formalize and prove the convergence of a Distributed Round-Robin Q-learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out a round-robin scheduling of action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs that lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent's local state-action value function learning is an independent process, including the MSAV policies. Coordination of locally optimal policies to obtain the globally optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning, in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots.
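The two ideas the abstract combines, round-robin action scheduling and vetoing of state-action pairs that lead to a UTS, can be illustrated with a small tabular sketch. This is not the authors' D-RR-QL implementation: the toy environment (two robots on a five-cell line joined by a link of limited length), the reward values, every constant, and the choice to bootstrap each update from the table of the agent that acts next are all assumptions made for illustration. The sketch also omits the modular state decomposition and the message-passing greedy coordination described in the abstract.

```python
import random
from collections import defaultdict

# Hypothetical toy problem (not the paper's hose task): two robots on a line
# of N_CELLS cells, joined by a link that breaks if they drift too far apart.
N_CELLS, MAX_GAP, GOAL = 5, 2, 4
ACTIONS = (-1, 0, 1)  # move left, stay, move right


def step(state, agent, action):
    """Move one robot; return (next_state, reward, done, uts)."""
    pos = list(state)
    pos[agent] = min(max(pos[agent] + action, 0), N_CELLS - 1)
    nxt = tuple(pos)
    if abs(nxt[0] - nxt[1]) > MAX_GAP:   # link overstretched: undesired termination
        return nxt, -1.0, True, True
    if nxt[0] == GOAL:                    # robot 0 reached the goal cell
        return nxt, 1.0, True, False
    return nxt, 0.0, False, False


def allowed(vetoed, agent, state):
    """Actions of `agent` in `state` that have not been vetoed."""
    return [a for a in ACTIONS if (state, a) not in vetoed[agent]]


def train(episodes=3000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [defaultdict(float), defaultdict(float)]   # one local table per agent
    vetoed = [set(), set()]                        # one local veto set per agent
    for _ in range(episodes):
        state, done, t = (0, 0), False, 0
        while not done and t < 100:
            agent = t % 2                          # round-robin: one agent acts per step
            acts = allowed(vetoed, agent, state)
            if rng.random() < epsilon:
                action = rng.choice(acts)
            else:                                  # greedy with random tie-breaking
                best = max(q[agent][state, a] for a in acts)
                action = rng.choice([a for a in acts if q[agent][state, a] == best])
            nxt, reward, done, uts = step(state, agent, action)
            if uts:
                vetoed[agent].add((state, action))  # never try this pair again
            target = reward
            if not done:
                nxt_agent = (t + 1) % 2            # assumption: bootstrap from next actor
                target += gamma * max(q[nxt_agent][nxt, a]
                                      for a in allowed(vetoed, nxt_agent, nxt))
            q[agent][state, action] += alpha * (target - q[agent][state, action])
            state, t = nxt, t + 1
    return q, vetoed


def greedy_rollout(q, vetoed, max_steps=50):
    """Follow the learned greedy policy; return (final_state, steps, uts)."""
    state, uts = (0, 0), False
    for t in range(max_steps):
        agent = t % 2
        action = max(allowed(vetoed, agent, state), key=lambda a: q[agent][state, a])
        state, _, done, uts = step(state, agent, action)
        if done:
            return state, t + 1, uts
    return state, max_steps, uts
```

Calling `train()` and then `greedy_rollout(q, vetoed)` runs the learned policy from the start state. Because only one agent acts at each step, every transition an agent observes is caused by its own action, which is the sense in which round-robin scheduling removes non-stationarity in this sketch. The veto sets shrink each agent's action choices after a single bad experience instead of waiting for a negative value to propagate.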


Similar resources

Multistrategy Learning Methods for Multirobot Systems

This article describes three different methods for introducing machine learning into a hybrid deliberative/reactive architecture for multirobot systems: learning momentum, Q-learning, and CBR wizards. A range of simulation experiments and results are reported using the Georgia Tech MissionLab mission specification system.


The Necessity of Average Rewards in Cooperative Multirobot Learning

Learning can be an effective way for robot systems to deal with dynamic environments and changing task conditions. However, popular single-robot learning algorithms based on discounted rewards, such as Q-learning, do not achieve cooperation (i.e., purposeful division of labor) when applied to task-level multirobot systems. A task-level system is defined as one performing a mission that is decompo...


Crucial Factors in Cooperative Multirobot Learning

Cooperative decentralized multirobot learning refers to the use of multiple learning entities to learn optimal solutions for an overall multirobot system. We demonstrate that traditional single-robot learning theory can be successfully used with multirobot systems, but only under certain conditions. The success and the effectiveness of single-robot learning algorithms in multirobot systems are ...


Crucial factors affecting cooperative multirobot learning


Stability Improvement of Hydraulic Turbine Regulating System Using Round-Robin Scheduling Algorithm

The sustainability of hydraulic turbines is one of the most important issues considered by electrical energy providers. Increased damping of electromechanical oscillations is one of the key issues in turbine sustainability. Electromechanical oscillations, if not quickly damped, can threaten the stability of hydraulic turbines and cause the separation of different parts of the netw...


Volume 10

Publication date: 2015
